Policy search in kernel Hilbert space

Authors

  • J. Andrew Bagnell
  • Jeff Schneider
Abstract

Much recent work in reinforcement learning and stochastic optimal control has focused on algorithms that search directly through a space of policies rather than building approximate value functions. Policy search has numerous advantages: it does not rely on the Markov assumption, domain knowledge may be encoded in a policy, the policy may require less representational power than a value-function approximation, and stable and convergent algorithms are well understood. In contrast with value-function methods, however, existing approaches to policy search have heretofore focused entirely on parametric approaches. This places fundamental limits on the kind of policies that can be represented. In this work, we show how policy search (with or without the additional guidance of value functions) in a Reproducing Kernel Hilbert Space gives a simple and rigorous extension of the technique to non-parametric settings. In particular, we investigate a new class of algorithms which generalize REINFORCE-style likelihood ratio methods to yield both online and batch techniques that perform gradient search in a function space of policies. Further, we describe the computational tools that allow efficient implementation. Finally, we apply our new techniques to interesting reinforcement learning problems.
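
The idea of a REINFORCE-style gradient step taken in a function space can be illustrated with a minimal sketch. This is not the authors' algorithm as published; it assumes a Gaussian policy whose mean function f lies in an RKHS with an RBF kernel, so that the functional gradient of log π(a|s) with respect to f is ((a − f(s)) / σ²) k(s, ·). The toy one-step task, the running-average baseline, and all names (`KernelPolicy`, `rbf`, `train`) are hypothetical choices made for illustration.

```python
import math
import random


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalar inputs (an assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


class KernelPolicy:
    """Gaussian policy with mean f(s) = sum_i alpha_i * k(c_i, s)."""

    def __init__(self, sigma=0.5):
        self.centers = []   # states at which kernel functions are placed
        self.alphas = []    # coefficient attached to each kernel function
        self.sigma = sigma  # fixed exploration noise (standard deviation)

    def mean(self, s):
        return sum(a * rbf(c, s) for c, a in zip(self.centers, self.alphas))

    def sample(self, s, rng):
        return rng.gauss(self.mean(s), self.sigma)

    def update(self, s, a, advantage, step):
        # Likelihood-ratio step in function space:
        # f <- f + step * advantage * ((a - f(s)) / sigma^2) * k(s, .)
        coeff = step * advantage * (a - self.mean(s)) / self.sigma ** 2
        self.centers.append(s)
        self.alphas.append(coeff)


def train(episodes=200, step=0.05, seed=0):
    """Online updates on a hypothetical one-step task with reward -(a - sin(3s))^2."""
    rng = random.Random(seed)
    policy = KernelPolicy()
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(episodes):
        s = rng.uniform(-1.0, 1.0)
        a = policy.sample(s, rng)
        reward = -(a - math.sin(3.0 * s)) ** 2
        policy.update(s, a, reward - baseline, step)
        baseline += 0.05 * (reward - baseline)
    return policy


if __name__ == "__main__":
    learned = train()
    print(len(learned.centers), "kernel centers after training")
```

Each update adds one kernel center, so the representation grows with the number of samples; a practical implementation would presumably need some form of sparsification, which this sketch omits.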


Similar resources

Reproducing Kernel Space Hilbert Method for Solving Generalized Burgers Equation

In this paper, a new method based on Reproducing Kernel Space (RKS) theory and an iterative algorithm for solving the Generalized Burgers Equation (GBE) are presented. The analytical solution is expressed as a series in an RKS, and the approximate solution u(x,t) is constructed by truncating the series. The convergence of u(x,t) to the analytical solution is also proved.


Solving multi-order fractional differential equations by reproducing kernel Hilbert space method

In this paper we propose a relatively new semi-analytical technique to approximate the solution of nonlinear multi-order fractional differential equations (FDEs). We present some results concerning the uniqueness of solutions of nonlinear multi-order FDEs and discuss the existence of solutions for nonlinear multi-order FDEs in reproducing kernel Hilbert space (RKHS). We further give an error a...


Solving Fuzzy Impulsive Fractional Differential Equations by Reproducing Kernel Hilbert Space Method

The aim of this paper is to use the Reproducing Kernel Hilbert Space Method (RKHSM) to solve linear and nonlinear fuzzy impulsive fractional differential equations. Finding numerical solutions of this class of equations is a difficult problem. In this study, convergence analysis, error estimates and error bounds are discussed in detail under some hypotheses which provi...


Policy Search in Reproducing Kernel Hilbert Space

Modeling policies in reproducing kernel Hilbert space (RKHS) renders policy gradient reinforcement learning algorithms non-parametric. As a result, the policies become very flexible and have a rich representational potential without a predefined set of features. However, their performances might be either non-covariant under reparameterization of the chosen kernel, or very sensitive to step-siz...


Online Relative Entropy Policy Search using Reproducing Kernel Hilbert Space Embeddings

Kernel methods have been successfully applied to reinforcement learning problems to address some challenges such as high dimensional and continuous states, value function approximation and state transition probability modeling. In this paper, we develop an online policy search algorithm based on a recent state-of-the-art algorithm REPS-RKHS that uses conditional kernel embeddings. Our online al...




Publication date: 2003